
Scalable MCMC for Large Data Problems using Data Subsampling and the Difference Estimator


Abstract

We propose a generic Markov Chain Monte Carlo (MCMC) algorithm to speed up computations for datasets with many observations. A key feature of our approach is the use of the highly efficient difference estimator from the survey sampling literature to estimate the log-likelihood accurately using only a small fraction of the data. Our algorithm improves on the $O(n)$ complexity of regular MCMC by operating over local data clusters instead of the full sample when computing the likelihood. The likelihood estimate is used in a pseudo-marginal framework to sample from a perturbed posterior which is within $O(m^{-1/2})$ of the true posterior, where $m$ is the subsample size. The method is applied to a logistic regression model to predict firm bankruptcy for a large data set. We document a significant speedup in comparison to the standard MCMC on the full dataset.
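The difference estimator mentioned in the abstract can be illustrated with a short sketch. This is not the authors' implementation; it only shows the general survey-sampling idea under stated assumptions. Each log-likelihood term l_k(θ) is split into a cheap proxy q_k(θ) plus a remainder l_k(θ) − q_k(θ); the proxy sum is computed for all n observations, and the remainder sum is estimated from a subsample of size m. Here the proxy is assumed to be the logistic log-likelihood term evaluated at the centroid of the observation's cluster, so the proxy sum costs one pass over the clusters rather than over the data. The function names (`loglik_terms`, `cluster_summaries`, `difference_estimate`), the simulated data, and the quantile-grid stand-in for a clustering step are all hypothetical choices made for illustration.

```python
import numpy as np

def loglik_terms(theta, X, y):
    """Per-observation logistic regression log-likelihood contributions."""
    eta = X @ theta
    return y * eta - np.logaddexp(0.0, eta)

def cluster_summaries(X, y, labels, n_clusters):
    """Centroid, size, and sum of responses for each cluster (computed once)."""
    counts = np.bincount(labels, minlength=n_clusters).astype(float)
    sums = np.zeros((n_clusters, X.shape[1]))
    np.add.at(sums, labels, X)  # accumulate covariates per cluster
    centroids = sums / np.maximum(counts, 1.0)[:, None]
    y_sums = np.bincount(labels, weights=y, minlength=n_clusters)
    return centroids, counts, y_sums

def difference_estimate(theta, X, y, labels, summaries, idx):
    """Difference estimator of the full-data log-likelihood.

    Assumed proxy: q_k(theta) = y_k * eta_c - log(1 + exp(eta_c)), where
    eta_c is the linear predictor at the centroid of observation k's cluster.
    idx holds the subsampled row indices (drawn uniformly with replacement).
    """
    centroids, counts, y_sums = summaries
    eta_c = centroids @ theta
    # Sum of proxies over all n observations at O(number of clusters) cost.
    proxy_total = np.sum(y_sums * eta_c - counts * np.logaddexp(0.0, eta_c))
    # Exact-minus-proxy differences, evaluated on the subsample only.
    eta_proxy = eta_c[labels[idx]]
    q = y[idx] * eta_proxy - np.logaddexp(0.0, eta_proxy)
    d = loglik_terms(theta, X[idx], y[idx]) - q
    n = X.shape[0]
    return proxy_total + n * d.mean()  # n * mean(d) = (n / m) * sum(d)

# Simulated data (illustrative only, not the bankruptcy data set).
rng = np.random.default_rng(0)
n, m = 20_000, 500
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
theta = np.array([-1.0, 0.8, -0.5])
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-(X @ theta)))).astype(float)

# Crude stand-in for a clustering step: a 10 x 10 grid of quantile bins.
levels = np.linspace(0.1, 0.9, 9)
b1 = np.digitize(X[:, 1], np.quantile(X[:, 1], levels))
b2 = np.digitize(X[:, 2], np.quantile(X[:, 2], levels))
labels = b1 * 10 + b2
summaries = cluster_summaries(X, y, labels, 100)

idx = rng.integers(0, n, size=m)
estimate = difference_estimate(theta, X, y, labels, summaries, idx)
exact = loglik_terms(theta, X, y).sum()
full_pass = difference_estimate(theta, X, y, labels, summaries, np.arange(n))
```

When every observation is used exactly once (`full_pass`), the proxy terms cancel and the estimate reduces to the exact log-likelihood, which is a convenient sanity check. With a random subsample, the variance of the estimate depends on how closely the proxies track the exact terms, which is why a good proxy matters more than a large m.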


Bibliographic details

Authors: Quiroz, Matias; Villani, Mattias; Kohn, Robert
Year: 2017
Original format: PDF

Similar literature


1. Speeding Up MCMC by Efficient Data Subsampling [J]. Quiroz Matias, Kohn Robert, Villani Mattias. Journal of the American Statistical Association. 2019, No. 526
2. Speeding up MCMC by delayed acceptance and data subsampling [J]. Quality Control and Applied Statistics. 2019, No. 5a6
3. How can subsampling reduce complexity in sequential MCMC methods and deal with big data in target tracking? [C]. De Freitas Allan, Septier Francois, Mihaylova Lyudmila. International Conference on Information Fusion. 2015
4. Divide and Recombine for Large and Complex Data: Model Likelihood Functions Using MCMC and TRMM Big Data Analysis [D]. Liu, Qi. 2018
5. BESSiE: a software for linear model BLUP and Bayesian MCMC analysis of large-scale genomic data [O]. Vinzent Boerner, Bruce Tier. 2016
6. Scalable MCMC for large data problems using data subsampling and the difference estimator [O]. Quiroz Matias, Villani Mattias, Kohn Robert. 2015
